Abstract. The degree of a CSP instance is the maximum number of times that a variable 
may appear in the scope of constraints. We consider the approximate counting problem 
for Boolean CSPs with bounded-degree instances, for constraint languages containing the 
two unary constant relations {0} and {1}. When the maximum degree is at least 25 we 
obtain a complete classification of the complexity of this problem. It is exactly solvable 
in polynomial time if every relation in the constraint language is affine. It is equivalent 
to the problem of approximately counting independent sets in bipartite graphs if every 
relation can be expressed as conjunctions of {0}, {1} and binary implication. Otherwise, 
there is no FPRAS unless NP = RP. For lower degree bounds, additional cases arise in 
which the complexity is related to the complexity of approximately counting independent 
sets in hypergraphs. 



1. Introduction 

In the constraint satisfaction problem (CSP), we seek to assign values from some domain 
to a set of variables, while satisfying given constraints on the combinations of values that 
certain subsets of the variables may take. Constraint satisfaction problems are ubiquitous in 
computer science, with close connections to graph theory, database query evaluation, type 
inference, satisfiability, scheduling and artificial intelligence [20, 22, 25]. CSP can also be 
reformulated in terms of homomorphisms between relational structures [13] and conjunctive 
query containment in database theory [20]. Weighted versions of CSP appear in statistical 
physics, where they correspond to partition functions of spin systems [31]. 

We give formal definitions in Section 2 but, for now, consider an undirected graph $G$ 
and the CSP where the domain is {red, green, blue}, the variables are the vertices of $G$ 
and the constraints specify that, for every edge $xy \in G$, $x$ and $y$ must be assigned different 
values. Thus, in a satisfying assignment, no two adjacent vertices are given the same colour: 
the CSP is satisfiable if, and only if, the graph is 3-colourable. As a second example, given 
a formula in 3-CNF, we can write a system of constraints over the variables, with domain 
{true, false}, that requires the assignment to each clause to satisfy at least one literal. 
Clearly, the resulting CSP is directly equivalent to the original satisfiability problem. 

1.1. Decision CSP 

In the uniform constraint satisfaction problem, we are given the set of constraints ex- 
plicitly, as lists of allowable combinations for given subsets of the variables; these lists can 
be considered as relations over the domain. Since it includes problems such as 3-SAT and 
3-COLOURABILITY, uniform CSP is NP-complete. However, uniform CSP also includes 
problems in P, such as 2-SAT and 2-COLOURABILITY, raising the natural question of what 
restrictions lead to tractable problems. There are two natural ways to restrict CSP: we can 
restrict the form of the instances and we can restrict the form of the constraints. 

The most common restriction to CSP is to allow only certain fixed relations in the 
constraints. The list of allowed relations is known as the constraint language and we write 
$\mathrm{CSP}(\Gamma)$ for the so-called non-uniform CSP in which each constraint states that the values 
assigned to some tuple of variables must be a tuple in a specified relation in $\Gamma$. 

The classic example of this is Schaefer's dichotomy for Boolean constraint languages $\Gamma$ 
(i.e., those with domain $\{0, 1\}$; often called "generalized satisfiability") [26]. He showed that 
$\mathrm{CSP}(\Gamma)$ is in P if $\Gamma$ is included in one of six classes and is NP-complete, otherwise. More 
recently, Bulatov has produced a corresponding dichotomy for the three-element domain [2]. 
These two results restrict the size of the domain but allow relations of arbitrary arity in 
the constraint language. The converse restriction (relations of restricted arity, especially 
binary relations, over arbitrary finite domains) has also been studied in depth [16, 17]. 

For all $\Gamma$ studied so far, $\mathrm{CSP}(\Gamma)$ has been either in P or NP-complete and Feder and 
Vardi have conjectured that this holds for every constraint language [14]. Ladner has shown 
that it is not the case that every problem in NP is either in P or NP-complete since, if 
$\mathrm{P} \ne \mathrm{NP}$, there is an infinite, strict hierarchy between the two [23]. However, there are 
problems in NP, such as graph Hamiltonicity and even connectedness, that cannot be 
expressed as $\mathrm{CSP}(\Gamma)$ for any finite $\Gamma$ (this follows from results on the expressive power 
of existential monadic second-order logic [12]), and Ladner's diagonalization does not seem 
to be expressible in CSP [14], so a dichotomy for CSP appears possible. 

Restricting the tree-width of instances has also been a fruitful direction of research 
[15, 21]. In contrast, little is known about restrictions on the degree of instances, i.e., the 
maximum number of times that any variable may appear. Dalmau and Ford have shown 
that, for any fixed Boolean constraint language $\Gamma$ containing the constant unary relations 
$R_{\mathrm{zero}} = \{0\}$ and $R_{\mathrm{one}} = \{1\}$, the complexity of $\mathrm{CSP}(\Gamma)$ for instances of degree at most 
three is exactly the same as the complexity of $\mathrm{CSP}(\Gamma)$ with no degree restriction [6]. The 
case where variables may appear at most twice has not yet been completely classified; it is 
known that degree-2 $\mathrm{CSP}(\Gamma)$ is as hard as general $\mathrm{CSP}(\Gamma)$ whenever $\Gamma$ contains $R_{\mathrm{zero}}$ and 
$R_{\mathrm{one}}$ and some relation that is not a $\Delta$-matroid [13]; the known polynomial-time cases come 
from restrictions on the kinds of $\Delta$-matroids that appear in $\Gamma$ [6]. 

1.2. Counting CSP 

A generalization of classical CSP is to ask how many satisfying solutions there are. 
This is referred to as counting CSP, #CSP. Clearly, the decision problem is reducible to 
counting: if we can efficiently count the solutions, we can efficiently determine whether there 
is at least one. The converse does not hold: for example, we can determine in polynomial 
time whether a graph admits a perfect matching but it is #P-complete to count the perfect 
matchings, even in a bipartite graph [29]. 

#P is the class of functions $f$ for which there is a nondeterministic, polynomial-time 
Turing machine that has exactly $f(x)$ accepting paths for input $x$ [28]. It is easily seen 
that the counting version of any NP decision problem is in #P and #P can be considered 
the counting "analogue" of NP. Note, though, that problems that are #P-complete under 
appropriate reductions are, under standard complexity-theoretic assumptions, considerably 
harder than NP-complete problems: $\mathrm{P}^{\#\mathrm{P}}$ includes the whole of the polynomial hierarchy 
[27], whereas $\mathrm{P}^{\mathrm{NP}}$ is generally thought not to. 

Although no dichotomy is known for CSP, Bulatov has recently shown that, for all 
$\Gamma$, $\#\mathrm{CSP}(\Gamma)$ is either computable in polynomial time or #P-complete [3]. However, 
Bulatov's dichotomy sheds little light on which constraint languages yield polynomial-time 
counting CSPs and which do not. The criterion of the dichotomy is based on "defects" in 
a certain infinite algebra built up from the polymorphisms of $\Gamma$ and it is open whether the 
characterization is even decidable. It also seems not to apply to bounded-degree #CSP. 

So, although there is a full dichotomy for $\#\mathrm{CSP}(\Gamma)$, results for restricted forms of 
constraint language are still of interest. Creignou and Hermann have shown that only one of 
Schaefer's polynomial-time cases for Boolean languages survives the transition to counting: 
$\#\mathrm{CSP}(\Gamma) \in \mathrm{FP}$ (i.e., has a polynomial-time algorithm) if $\Gamma$ is affine (i.e., each relation is 
the solution set of a system of linear equations over $\mathrm{GF}_2$) and is #P-complete, otherwise [5]. 
This result has been extended to rational and even complex-weighted instances [4, 10] and, in 
the latter case, the dichotomy is shown to hold for the restriction of the problem in which 
instances have degree 3. This implies that the degree-3 problem $\#\mathrm{CSP}_3(\Gamma)$ ($\#\mathrm{CSP}(\Gamma)$ 
restricted to instances of degree 3) is in FP if $\Gamma$ is affine and is #P-complete, otherwise. 

1.3. Approximate counting 

Since $\#\mathrm{CSP}(\Gamma)$ is very often #P-complete, approximation algorithms play an important 
role. The key concept is that of a fully polynomial randomized approximation scheme 
(FPRAS). This is a randomized algorithm for computing some function $f(x)$, taking as its 
input $x$ and a constant $\varepsilon > 0$, and computing a value $Y$ such that 
$e^{-\varepsilon} \le Y/f(x) \le e^{\varepsilon}$ with probability at least $\frac{3}{4}$, in time polynomial in both 
$|x|$ and $\varepsilon^{-1}$. (See Section 2.4.) 

Dyer, Goldberg and Jerrum have classified the complexity of approximately computing 
$\#\mathrm{CSP}(\Gamma)$ for Boolean constraint languages [9]. When all relations in $\Gamma$ are affine, $\#\mathrm{CSP}(\Gamma)$ 
can be computed exactly in polynomial time by the result of Creignou and Hermann 
discussed above [5]. Otherwise, if every relation in $\Gamma$ can be defined by a conjunction of pins 
(i.e., assertions $v = 0$ or $v = 1$) and Boolean implications, then $\#\mathrm{CSP}(\Gamma)$ is as hard to 
approximate as the problem #BIS of counting independent sets in a bipartite graph; 
otherwise, $\#\mathrm{CSP}(\Gamma)$ is as hard to approximate as the problem #SAT of counting the satisfying 
truth assignments of a Boolean formula. Dyer, Goldberg, Greenhill and Jerrum have shown 
that the latter problem is complete for #P under appropriate approximation-preserving 
reductions (see Section 2.4) and has no FPRAS unless NP = RP [8], which is thought to 
be unlikely. The complexity of #BIS is currently open: there is no known FPRAS but it is 
not known to be #P-complete, either. #BIS is known to be complete for a logically-defined 
subclass of #P with respect to approximation-preserving reductions [8]. 

1.4. Our result 

We consider the complexity of approximately solving Boolean #CSP problems when 
instances have bounded degree. Following Dalmau and Ford [6] and Feder [13] we consider 
the case in which $R_{\mathrm{zero}} = \{0\}$ and $R_{\mathrm{one}} = \{1\}$ are available. We proceed by showing that 
any Boolean relation that is not definable as a conjunction of ORs or NANDs can be used 
in low-degree instances to assert equalities between variables. Thus, we can side-step degree 
restrictions by replacing high-degree variables with distinct variables asserted to be equal. 

Our main result, Corollary 6.6, is a trichotomy for the case in which instances have 
maximum degree $d$ for some $d \ge 25$. If every relation in $\Gamma$ is affine, then 
$\#\mathrm{CSP}_d(\Gamma \cup \{R_{\mathrm{zero}}, R_{\mathrm{one}}\})$ is solvable in polynomial time. Otherwise, if every relation in 
$\Gamma$ can be defined as a conjunction of $R_{\mathrm{zero}}$, $R_{\mathrm{one}}$ and binary implications, then 
$\#\mathrm{CSP}_d(\Gamma \cup \{R_{\mathrm{zero}}, R_{\mathrm{one}}\})$ is equivalent in approximation complexity to #BIS. Otherwise, 
it has no FPRAS unless NP = RP. Theorem 6.5 gives a partial classification of the complexity 
when $d < 25$. In the new cases that arise here, the complexity is given in terms of the 
complexity of counting independent sets in hypergraphs with bounded degree and bounded 
hyper-edge size. The complexity of this problem is not fully understood and we explain what 
is known about it in Section 6. 

2. Preliminaries 

2.1. Basic notation 

We write $\mathbf{a}$ for the tuple $(a_1, \dots, a_r)$, which we often shorten to $\mathbf{a} = a_1 \dots a_r$. We 
write $a^r$ for the $r$-tuple $a \dots a$ and $\mathbf{a}\mathbf{b}$ for the tuple formed from the elements of $\mathbf{a}$ 
followed by those of $\mathbf{b}$. The bit-wise complement of a relation $R \subseteq \{0,1\}^r$ is the relation 
$\widetilde{R} = \{(a_1 \oplus 1, \dots, a_r \oplus 1) \mid \mathbf{a} \in R\}$, where $\oplus$ denotes addition modulo 2. 

We say that a relation $R$ is ppp-definable in a relation $R'$ and write $R \le_{\mathrm{ppp}} R'$ if $R$ 
can be obtained from $R'$ by some sequence of the following operations: 

• permutation of columns (for notational convenience only); 

• pinning (taking sub-relations of the form $R_{i \mapsto c} = \{\mathbf{a} \in R \mid a_i = c\}$ for some $i$ and 
some $c \in \{0, 1\}$); and 

• projection ("deleting the $i$th column" to give the relation 
$\{a_1 \dots a_{i-1} a_{i+1} \dots a_r \mid a_1 \dots a_r \in R\}$). 

It is easy to see that $\le_{\mathrm{ppp}}$ is reflexive and transitive and that, if $R \le_{\mathrm{ppp}} R'$, then $R$ can 
be obtained from $R'$ by first permuting the columns, then making some pins and then 
projecting. 
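
The three operations can be sketched in Python, representing a relation as a set of bit 
tuples. This is only an illustration: the function names `permute`, `pin` and `project` are 
not taken from the paper, and columns are 0-indexed here whereas the text counts from 1. 

```python
def permute(rel, order):
    """Reorder columns: new column j is old column order[j]."""
    return {tuple(t[i] for i in order) for t in rel}


def pin(rel, i, c):
    """Keep only the tuples whose column i equals the constant c."""
    return {t for t in rel if t[i] == c}


def project(rel, i):
    """Delete column i from every tuple."""
    return {t[:i] + t[i + 1:] for t in rel}


# Ternary OR: every 3-bit tuple except 000.
R_OR3 = {(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)} - {(0, 0, 0)}

# Pin the last column to 0 and project it away: binary OR remains,
# so binary OR is ppp-definable in ternary OR.
R_OR2 = project(pin(R_OR3, 2, 0), 2)
```

Pinning before projecting matters: projecting the last column of `R_OR3` without pinning 
would give the complete binary relation instead. 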

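We write $R_{=} = \{00, 11\}$, $R_{\ne} = \{01, 10\}$, $R_{\mathrm{OR}} = \{01, 10, 11\}$, $R_{\mathrm{NAND}} = \{00, 01, 10\}$, 
$R_{\to} = \{00, 01, 11\}$ and $R_{\leftarrow} = \{00, 10, 11\}$. For $k \ge 2$, we write $R_{=,k} = \{0^k, 1^k\}$, 
$R_{\mathrm{OR},k} = \{0,1\}^k \setminus \{0^k\}$ and $R_{\mathrm{NAND},k} = \{0,1\}^k \setminus \{1^k\}$ (i.e., $k$-ary equality, OR and NAND). 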



Note that ppp-definability should not be confused with the concept of primitive positive 
definability (pp-definability), which appears in algebraic treatments of CSP and #CSP, for 
example in the work of Bulatov [3]. 

2.2. Boolean constraint satisfaction problems 

A constraint language is a set $\Gamma = \{R_1, \dots, R_m\}$ of named Boolean relations. Given a 
set $V$ of variables, the set of constraints over $\Gamma$ is the set $\mathrm{Cons}(V, \Gamma)$ which contains $R(\mathbf{v})$ 
for every relation $R \in \Gamma$ with arity $r$ and every $\mathbf{v} \in V^r$. Note that $v = v'$ and $v \ne v'$ are 
not constraints unless the appropriate relations are included in $\Gamma$. The scope of a constraint 
$R(\mathbf{v})$ is the tuple $\mathbf{v}$, which need not consist of distinct variables. 

An instance of the constraint satisfaction problem (CSP) over $\Gamma$ is a set $V$ of variables 
and a set $C \subseteq \mathrm{Cons}(V, \Gamma)$ of constraints. An assignment to a set $V$ of variables is a function 
$\sigma \colon V \to \{0, 1\}$. An assignment to $V$ satisfies an instance $(V, C)$ if $(\sigma(v_1), \dots, \sigma(v_r)) \in R$ 
for every constraint $R(v_1, \dots, v_r)$. We write $Z(I)$ for the number of satisfying assignments 
to a CSP instance $I$. We study the counting CSP problem $\#\mathrm{CSP}(\Gamma)$, parameterized by $\Gamma$, 
in which we must compute $Z(I)$ for an instance $I = (V, C)$ of CSP over $\Gamma$. 

The degree of an instance is the greatest number of times any variable appears among 
its constraints. Note that the variable $v$ appears twice in the constraint $R(v, v)$. Our specific 
interest in this paper is in classifying the complexity of bounded-degree counting CSPs. For 
a constraint language $\Gamma$ and a positive integer $d$, define $\#\mathrm{CSP}_d(\Gamma)$ to be the restriction of 
$\#\mathrm{CSP}(\Gamma)$ to instances of degree at most $d$. Instances of degree 1 are trivial. 

Theorem 2.1. For any $\Gamma$, $\#\mathrm{CSP}_1(\Gamma) \in \mathrm{FP}$. ■ 

When considering $\#\mathrm{CSP}_d$ for $d \ge 2$, we follow established practice by allowing pinning 
in the constraint language [6, 13]. We write $R_{\mathrm{zero}} = \{0\}$ and $R_{\mathrm{one}} = \{1\}$ for the two 
singleton unary relations. We refer to constraints in $R_{\mathrm{zero}}$ and $R_{\mathrm{one}}$ as pins. To make 
notation easier, we will sometimes write constraints using constants instead of explicit pins. 
That is, we will allow the constants 0 and 1 to appear in the place of variables in the scopes 
of constraints. Such constraints can obviously be rewritten as a set of "proper" constraints, 
without increasing degree. We let $\Gamma_{\mathrm{pin}}$ denote the constraint language $\{R_{\mathrm{zero}}, R_{\mathrm{one}}\}$. 
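
To make the definitions of $Z(I)$ and degree concrete, the following is a minimal brute-force 
sketch. It enumerates all $2^{|V|}$ assignments, so it only illustrates the definitions and is not 
an efficient algorithm; the representation of a constraint as a `(relation, scope)` pair and the 
names `count_solutions` and `degree` are assumptions made for this example. 

```python
from collections import Counter
from itertools import product


def count_solutions(variables, constraints):
    """Z(I): the number of assignments satisfying every constraint.

    Each constraint is a pair (relation, scope), where the relation is a
    set of bit tuples and the scope is a tuple of variable names.
    """
    total = 0
    for bits in product((0, 1), repeat=len(variables)):
        sigma = dict(zip(variables, bits))
        if all(tuple(sigma[v] for v in scope) in rel for rel, scope in constraints):
            total += 1
    return total


def degree(constraints):
    """Greatest number of occurrences of any variable among the scopes."""
    occurrences = Counter(v for _, scope in constraints for v in scope)
    return max(occurrences.values(), default=0)


R_OR = frozenset({(0, 1), (1, 0), (1, 1)})
R_ZERO = frozenset({(0,)})  # the pin "variable = 0"

# OR(x, y), OR(y, z) and the pin x = 0.
instance = [(R_OR, ("x", "y")), (R_OR, ("y", "z")), (R_ZERO, ("x",))]
```

Here the pin forces $x = 0$, hence $y = 1$, and $z$ is free, so the instance has two satisfying 
assignments; $x$ and $y$ each occur twice, so the degree is 2. 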

2.3. Hypergraphs 

A hypergraph $H = (V, E)$ is a set $V = V(H)$ of vertices and a set $E = E(H) \subseteq \mathcal{P}(V)$ 
of non-empty hyper-edges. The degree of a vertex $v \in V(H)$ is the number 
$d(v) = |\{e \in E(H) \mid v \in e\}|$ and the degree of a hypergraph is the maximum degree of its 
vertices. If $w = \max\{|e| \mid e \in E(H)\}$, we say that $H$ has width $w$. An independent set in a 
hypergraph $H$ is a set $S \subseteq V(H)$ such that $e \not\subseteq S$ for every $e \in E(H)$. Note that an 
independent set may contain more than one vertex from any hyper-edge of size at least three. 

We write $\#w\text{-}\mathrm{HIS}$ for the problem of counting the independent sets in a width-$w$ 
hypergraph $H$, and $\#w\text{-}\mathrm{HIS}_d$ for the restriction of $\#w\text{-}\mathrm{HIS}$ to inputs of degree at most $d$. 
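
The counting problem can be illustrated by exhaustive enumeration. The sketch below is 
exponential in the number of vertices and is meant only to pin down the definitions of 
independent set, degree and width; the function names are chosen for this example. 

```python
from itertools import combinations


def count_independent_sets(vertices, hyperedges):
    """Count vertex sets S such that no hyper-edge is a subset of S."""
    edges = [frozenset(e) for e in hyperedges]
    count = 0
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if not any(e <= chosen for e in edges):
                count += 1
    return count


def hypergraph_degree(vertices, hyperedges):
    """Maximum number of hyper-edges containing any single vertex."""
    return max((sum(v in e for e in hyperedges) for v in vertices), default=0)


def width(hyperedges):
    """Size of the largest hyper-edge."""
    return max((len(e) for e in hyperedges), default=0)


triangle_edge = [{1, 2, 3}]   # a single hyper-edge of size three
path = [{1, 2}, {2, 3}]       # an ordinary graph: the path 1 - 2 - 3
```

With the single hyper-edge $\{1, 2, 3\}$, every subset except the full vertex set is independent, 
which shows how an independent set may contain two vertices of the same hyper-edge. 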

2.4. Approximation complexity 

A randomized approximation scheme (RAS) for a function $f \colon \Sigma^* \to \mathbb{N}$ is a probabilistic 
Turing machine that takes as input a pair $(x, \varepsilon) \in \Sigma^* \times (0, 1)$, and produces, on an output 
tape, an integer random variable $Y$ with $\Pr(e^{-\varepsilon} \le Y/f(x) \le e^{\varepsilon}) \ge \frac{3}{4}$. A fully polynomial 
randomized approximation scheme (FPRAS) is a RAS that runs in time $\mathrm{poly}(|x|, \varepsilon^{-1})$. 

The choice of the value $\frac{3}{4}$ is inconsequential: the same class of problems has an FPRAS if 
we choose any probability $p$ with $\frac{1}{2} < p < 1$ [18]. 
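
As a small illustration of the accuracy condition, the sketch below checks whether a single 
estimate falls inside the allowed ratio window. It also shows the median of several runs, 
which is the usual way to boost the success probability of such a scheme; that technique is 
mentioned here as background and is not described in the text above. The function names 
and the sample numbers are invented for the example, and `exact` is assumed to be positive. 

```python
import math
from statistics import median


def within_tolerance(estimate, exact, eps):
    """True if exp(-eps) <= estimate / exact <= exp(eps)."""
    return math.exp(-eps) <= estimate / exact <= math.exp(eps)


def median_of_runs(estimates):
    """Combine independent estimates by taking their median."""
    return median(estimates)


# Five hypothetical runs for a quantity whose exact value is 100:
# two runs are badly off, but the median is still accurate.
runs = [98, 250, 101, 3, 102]
combined = median_of_runs(runs)
```

The median is robust here because it is only wrong when at least half of the runs are 
wrong, which becomes unlikely when each run succeeds with probability above $\frac{1}{2}$. 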
To compare the complexity of approximate counting problems, we use the AP-reductions 
of [8]. Suppose $f$ and $g$ are two functions from some input domain $\Sigma^*$ to the natural numbers 
and we wish to compare the complexity of approximately computing $f$ to that of 
approximately computing $g$. An approximation-preserving reduction from $f$ to $g$ is a probabilistic 
oracle Turing machine $M$ that takes as input a pair $(x, \varepsilon) \in \Sigma^* \times (0, 1)$, and satisfies the 
following three conditions: (i) every oracle call made by $M$ is of the form $(w, \delta)$ where 
$w \in \Sigma^*$ is an instance of $g$, and $0 < \delta < 1$ is an error bound satisfying $\delta^{-1} \le \mathrm{poly}(|x|, \varepsilon^{-1})$; 
(ii) $M$ is a randomized approximation scheme for $f$ whenever the oracle is a randomized 
approximation scheme for $g$; and (iii) the run-time of $M$ is polynomial in $|x|$ and $\varepsilon^{-1}$. 

If there is an approximation-preserving reduction from $f$ to $g$, we write $f \le_{\mathrm{AP}} g$ and 
say that $f$ is AP-reducible to $g$. In that case, if $g$ has an FPRAS, then so does $f$. If 
$f \le_{\mathrm{AP}} g$ and $g \le_{\mathrm{AP}} f$, then we say that $f$ and $g$ are AP-interreducible and write $f \equiv_{\mathrm{AP}} g$. 

3. Classes of relations 

A relation $R \subseteq \{0,1\}^r$ is affine if it is the set of solutions to some system of linear 
equations over $\mathrm{GF}_2$. That is, there is a set $\Sigma$ of equations in variables $x_1, \dots, x_r$, each of 
the form $\bigoplus_{i \in I} x_i = c$, where $\oplus$ denotes addition modulo 2 and $c \in \{0, 1\}$, such that 
$\mathbf{a} \in R$ if, and only if, the assignment $x_1 \mapsto a_1, \dots, x_r \mapsto a_r$ satisfies every equation in $\Sigma$. 
Note that the empty and complete relations are affine. 
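
The definition can be explored with a short sketch. `solutions` builds the relation defined 
by a system of equations by brute force. `is_affine` does not search for a system; it relies 
on the standard closure characterization (assumed here, not stated in the text) that a 
Boolean relation is affine exactly when $\mathbf{a} \oplus \mathbf{b} \oplus \mathbf{c}$ belongs to it for all tuples 
$\mathbf{a}, \mathbf{b}, \mathbf{c}$ in it. Both function names and the 0-based indexing are choices made for this 
example. 

```python
from itertools import product


def solutions(equations, r):
    """Solve a system over GF(2) on r variables by brute force.

    Each equation is a pair (indices, c) asserting that the XOR of the
    variables at the given 0-based indices equals c.
    """
    return {
        a for a in product((0, 1), repeat=r)
        if all(sum(a[i] for i in idx) % 2 == c for idx, c in equations)
    }


def is_affine(rel):
    """Closure test: a XOR b XOR c must stay in rel for all a, b, c in rel."""
    return all(
        tuple(x ^ y ^ z for x, y, z in zip(a, b, c)) in rel
        for a in rel for b in rel for c in rel
    )


# The single equation x1 XOR x2 = 1 defines disequality {01, 10}.
R_NEQ = solutions([((0, 1), 1)], 2)
R_OR = {(0, 1), (1, 0), (1, 1)}
```

Binary OR fails the test because $01 \oplus 10 \oplus 11 = 00$ is not in the relation, while the 
empty and complete relations pass, in line with the note above. 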

We define IM-conj to be the class of relations defined by a conjunction of pins and 
(binary) implications. This class is called $IM_2$ in [9]. 

Lemma 3.1. If $R \in$ IM-conj is not affine, then $R_{\to} \le_{\mathrm{ppp}} R$. ■ 

Let OR-conj be the set of Boolean relations that are defined by a conjunction of pins 
and ORs of any arity and NAND-conj the set of Boolean relations definable by conjunctions 
of pins and NANDs (i.e., negated conjunctions) of any arity. We say that one of the defining 
formulae of these relations is normalized if no pinned variable appears in any OR or NAND, 
the arguments of each individual OR and NAND are distinct, every OR or NAND has at 
least two arguments and no OR or NAND's arguments are a subset of any other's. 

Lemma 3.2. Every OR-conj (respectively, NAND-conj) relation is defined by a unique 
normalized formula. ■ 

Given the uniqueness of defining normalized formulae, we define the width of an OR-conj 
or NAND-conj relation $R$ to be $\mathrm{wd}(R)$, the greatest number of arguments to any of the 
ORs or NANDs in the normalized formula that defines it. Note that, from the definition of 
normalized formulae, there are no relations of width 1. 

Lemma 3.3. If $R \in$ OR-conj has width $w$, then $R_{\mathrm{OR},2}, \dots, R_{\mathrm{OR},w} \le_{\mathrm{ppp}} R$. Similarly, if 
$R \in$ NAND-conj has width $w$, then $R_{\mathrm{NAND},2}, \dots, R_{\mathrm{NAND},w} \le_{\mathrm{ppp}} R$. ■ 

Given tuples $\mathbf{a}, \mathbf{b} \in \{0,1\}^r$, we write $\mathbf{a} \le \mathbf{b}$ if $a_i \le b_i$ for all $i \in [1, r]$. If $\mathbf{a} \le \mathbf{b}$ and 
$\mathbf{a} \ne \mathbf{b}$, we write $\mathbf{a} < \mathbf{b}$. We say that a relation $R \subseteq \{0,1\}^r$ is monotone if, whenever 
$\mathbf{a} \in R$ and $\mathbf{a} \le \mathbf{b}$, then $\mathbf{b} \in R$. We say that $R$ is antitone if, whenever $\mathbf{a} \in R$ and $\mathbf{b} \le \mathbf{a}$, 
then $\mathbf{b} \in R$. Clearly, $R$ is monotone if, and only if, $\widetilde{R}$ is antitone. Call a relation 
pseudo-monotone (respectively, pseudo-antitone) if its restriction to non-constant columns is 
monotone (respectively, antitone). The following is a consequence of results in [19, Chapter 7.1.1]. 

Proposition 3.4. A relation $R \subseteq \{0,1\}^r$ is in OR-conj (respectively, NAND-conj) if, and 
only if, it is pseudo-monotone (respectively, pseudo-antitone). ■ 
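
Proposition 3.4 gives a direct membership test for OR-conj and NAND-conj, sketched below 
for relations stored as sets of bit tuples. The helper names are invented for this example. 
Monotonicity is checked one bit at a time, which suffices because any tuple above $\mathbf{a}$ can be 
reached from $\mathbf{a}$ by raising single zeros to ones. 

```python
def drop_constant_columns(rel):
    """Restrict a relation to its non-constant columns."""
    if not rel:
        return set()
    arity = len(next(iter(rel)))
    keep = [i for i in range(arity) if len({t[i] for t in rel}) > 1]
    return {tuple(t[i] for i in keep) for t in rel}


def is_monotone(rel):
    """Raising any single 0 to 1 in a tuple of rel must stay inside rel."""
    return all(
        t[:i] + (1,) + t[i + 1:] in rel
        for t in rel for i in range(len(t)) if t[i] == 0
    )


def is_antitone(rel):
    """Lowering any single 1 to 0 in a tuple of rel must stay inside rel."""
    return all(
        t[:i] + (0,) + t[i + 1:] in rel
        for t in rel for i in range(len(t)) if t[i] == 1
    )


def is_or_conj(rel):
    """Pseudo-monotone test, i.e. membership of OR-conj by Proposition 3.4."""
    return is_monotone(drop_constant_columns(rel))


def is_nand_conj(rel):
    """Pseudo-antitone test, i.e. membership of NAND-conj by Proposition 3.4."""
    return is_antitone(drop_constant_columns(rel))


R_OR = {(0, 1), (1, 0), (1, 1)}
R_NAND = {(0, 0), (0, 1), (1, 0)}
R_IMPLIES = {(0, 0), (0, 1), (1, 1)}
PINNED_OR = {(1, 0, 1), (1, 1, 0), (1, 1, 1)}  # first column pinned to 1
```

Binary implication is neither pseudo-monotone nor pseudo-antitone, so it lies in neither 
class. 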
4. Simulating equality 

An important ingredient in bounded-degree dichotomy theorems [3] is expressing 
equality using constraints from a language that does not necessarily include the equality relation. 

A constraint language $\Gamma$ is said to simulate the $k$-ary equality relation $R_{=,k}$ if, for some 
$\ell \ge k$, there is a $(\Gamma \cup \Gamma_{\mathrm{pin}})$-CSP instance $I$ with variables $x_1, \dots, x_\ell$ that has exactly $m \ge 1$ 
satisfying assignments $\sigma$ with $\sigma(x_1) = \dots = \sigma(x_k) = 0$, exactly $m$ with 
$\sigma(x_1) = \dots = \sigma(x_k) = 1$ and no other satisfying assignments. If, further, the degree of $I$ is $d$ 
and the degree of each variable $x_1, \dots, x_k$ is at most $d - 1$, we say that $\Gamma$ $d$-simulates $R_{=,k}$. 
We say that $\Gamma$ $d$-simulates equality if it $d$-simulates $R_{=,k}$ for all $k \ge 2$. 

The point is that, if $\Gamma$ $d$-simulates equality, we can express the constraint $y_1 = \dots = y_r$ 
in $\Gamma \cup \Gamma_{\mathrm{pin}}$ and then use each $y_i$ in one further constraint, while still having an instance of 
degree $d$. The variables $x_{k+1}, \dots, x_\ell$ in the definition function as auxiliary variables and are 
not used in any other constraint. Simulating equality makes degree bounds moot. 

Proposition 4.1. If $\Gamma$ $d$-simulates equality, then $\#\mathrm{CSP}(\Gamma) \le_{\mathrm{AP}} \#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$. ■ 

We now investigate which relations simulate equality. 

Lemma 4.2. $R \subseteq \{0,1\}^r$ 3-simulates equality if $R_{=} \le_{\mathrm{ppp}} R$, $R_{\ne} \le_{\mathrm{ppp}} R$ or $R_{\to} \le_{\mathrm{ppp}} R$. 

Proof. For each $k \ge 2$, we show how to 3-simulate $R_{=,k}$. We may assume without loss of 
generality that the ppp-definition of $R_{=}$, $R_{\ne}$ or $R_{\to}$ from $R$ involves applying the identity 
permutation to the columns, pinning columns 3 to $3 + p - 1$ inclusive to zero, pinning 
columns $3 + p$ to $3 + p + q - 1$ inclusive to one (that is, pinning $p \ge 0$ columns to zero and 
$q \ge 0$ to one) and then projecting away all but the first two columns. 

Suppose first that $R_{=} \le_{\mathrm{ppp}} R$ or $R_{\to} \le_{\mathrm{ppp}} R$. $R$ must contain $\alpha \ge 1$ tuples that begin 
$000^p1^q$, $\beta \ge 0$ that begin $010^p1^q$ and $\gamma \ge 1$ that begin $110^p1^q$, with $\beta = 0$ unless we are 
ppp-defining $R_{\to}$. We consider, first, the case where $\alpha = \gamma$, and show that we can 3-simulate 
$R_{=,k}$, expressing the constraint $R_{=,k}(x_1, \dots, x_k)$ with the constraints 

$$R(x_1 x_2 0^p 1^q *),\ R(x_2 x_3 0^p 1^q *),\ \dots,\ R(x_{k-1} x_k 0^p 1^q *),\ R(x_k x_1 0^p 1^q *),$$ 

where $*$ denotes a fresh $(r - 2 - p - q)$-tuple of variables in each constraint. These constraints 
are equivalent to $x_1 = \dots = x_k = x_1$ or to $x_1 \to \dots \to x_k \to x_1$ so constrain the variables 
$x_1, \dots, x_k$ to have the same value, as required. Every variable appears at most twice and 
there are $\alpha^k$ solutions to these constraints that put $x_1 = \dots = x_k = 0$, $\gamma^k = \alpha^k$ solutions 
with $x_1 = \dots = x_k = 1$ and no other solutions. Hence, $R$ 3-simulates $R_{=,k}$, as required. 

We now show, by induction on $r$, that we can 3-simulate $R_{=,k}$ even in the case that 
$\alpha \ne \gamma$. For the base case, $r = 2$, we have $\alpha = \gamma = 1$ and we are done. For the inductive 
step, let $r > 2$ and assume, w.l.o.g., that $\alpha > \gamma$ ($\alpha < \gamma$ is symmetric). In particular, we have 
$\alpha \ge 2$, so there are distinct tuples $000^p1^q\mathbf{a}$ and $000^p1^q\mathbf{b}$, and a tuple $110^p1^q\mathbf{c}$, in $R$. Choose 
$j$ such that $a_j \ne b_j$. Pinning the $(2 + p + q + j)$th column of $R$ to $c_j$ and projecting out the 
resulting constant column gives a relation $R'$ of arity $r - 1$ containing at least one tuple 
beginning $000^p1^q$ and at least one beginning $110^p1^q$: by the inductive hypothesis, $R'$ 
3-simulates $R_{=,k}$. 

Finally, we consider the case that $R_{\ne} \le_{\mathrm{ppp}} R$. $R$ contains $\alpha \ge 1$ tuples beginning $010^p1^q$ 
and $\beta \ge 1$ beginning $100^p1^q$. We express the constraint $R_{=,k}(x_1, \dots, x_k)$ by introducing 
fresh variables $y_1, \dots, y_k$ and using the constraints 

$$R(x_1 y_1 0^p 1^q *),\ R(x_2 y_2 0^p 1^q *),\ \dots,\ R(x_{k-1} y_{k-1} 0^p 1^q *),\ R(x_k y_k 0^p 1^q *),$$ 

$$R(y_1 x_2 0^p 1^q *),\ R(y_2 x_3 0^p 1^q *),\ \dots,\ R(y_{k-1} x_k 0^p 1^q *),\ R(y_k x_1 0^p 1^q *).$$ 

There are $\alpha^k\beta^k$ solutions when $x_1 = \dots = x_k = 0$ (and $y_1 = \dots = y_k = 1$) and $\beta^k\alpha^k$ 
solutions when the $x$s are 1 and the $y$s are 0. There are no other solutions and no variable 
is used more than twice. ■ 
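
The cyclic construction in the first part of the proof can be checked by brute force on a 
small case. The sketch below assumes a ternary relation with $p = q = 0$, so each constraint 
has exactly one fresh variable; the relation `R` is an arbitrary example with 
$\alpha = \gamma = 2$ and $\beta = 0$, and the function name is invented for the illustration. 

```python
from itertools import product


def cycle_gadget_counts(rel, k):
    """Count solutions of R(x1,x2,*), R(x2,x3,*), ..., R(xk,x1,*).

    rel is a ternary relation and each constraint gets its own fresh third
    variable. Returns a dict mapping each satisfying assignment of
    (x1, ..., xk) to the number of ways of extending it to the fresh variables.
    """
    counts = {}
    for bits in product((0, 1), repeat=2 * k):
        xs, fresh = bits[:k], bits[k:]
        if all((xs[i], xs[(i + 1) % k], fresh[i]) in rel for i in range(k)):
            counts[xs] = counts.get(xs, 0) + 1
    return counts


# Two tuples begin 00, two begin 11, none begin 01 or 10.
R = {(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)}
print(cycle_gadget_counts(R, 3))  # prints {(0, 0, 0): 8, (1, 1, 1): 8}
```

The two counts are both $\alpha^k = 2^3$, as the proof predicts, and each $x_i$ occurs in exactly two 
constraints, leaving one occurrence free for use elsewhere in a degree-3 instance. 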

For $c \in \{0, 1\}$, an $r$-ary relation is $c$-valid if it contains the tuple $c^r$. 

Lemma 4.3. Let $r \ge 2$ and let $R \subseteq \{0,1\}^r$ be 0- and 1-valid but not complete. Then $R$ 
3-simulates equality. ■ 

In the following lemma, we do not require $R$ and $R'$ to be distinct. The technique is to 
assert $x_1 = \dots = x_k$ by simulating the formula $\mathrm{OR}(x_1, y_1) \wedge \mathrm{NAND}(y_1, x_2) \wedge \mathrm{OR}(x_2, y_2) \wedge 
\mathrm{NAND}(y_2, x_3) \wedge \dots \wedge \mathrm{OR}(x_k, y_k) \wedge \mathrm{NAND}(y_k, x_1)$. 

Lemma 4.4. If $R_{\mathrm{OR}} \le_{\mathrm{ppp}} R$ and $R_{\mathrm{NAND}} \le_{\mathrm{ppp}} R'$, then $\{R, R'\}$ 3-simulates equality. ■ 
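
The formula behind Lemma 4.4 can be checked directly for small $k$. The sketch below 
enumerates all assignments to $x_1, \dots, x_k$ and $y_1, \dots, y_k$ and keeps those satisfying the 
alternating chain of ORs and NANDs; it works with plain binary OR and NAND rather than 
with relations that merely ppp-define them, and the function name is invented for the example. 

```python
from itertools import product


def or_nand_chain_solutions(k):
    """Solutions of OR(x_i, y_i) and NAND(y_i, x_{i+1}) for i = 1..k,
    with indices taken cyclically so that x_{k+1} is x_1."""
    found = []
    for bits in product((0, 1), repeat=2 * k):
        xs, ys = bits[:k], bits[k:]
        satisfied = all(
            (xs[i] or ys[i]) and not (ys[i] and xs[(i + 1) % k])
            for i in range(k)
        )
        if satisfied:
            found.append((xs, ys))
    return found
```

For every $k$ tried, exactly two assignments survive: all $x_i = 0$ with all $y_i = 1$, and all 
$x_i = 1$ with all $y_i = 0$. Each variable occurs twice, so the $x_i$ remain usable once more 
within degree 3. 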



5. Classifying relations 

We are now ready to prove that every Boolean relation $R$ is in OR-conj, in NAND-conj 
or 3-simulates equality. If $R_0$ and $R_1$ are $r$-ary, let $R_0 + R_1 = \{0\mathbf{a} \mid \mathbf{a} \in R_0\} \cup \{1\mathbf{a} \mid \mathbf{a} \in R_1\}$. 
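
The operation $R_0 + R_1$ and its inverse, which underlies the induction on arity used later 
in this section, can be sketched as follows. The names `combine` and `split` are chosen for 
this example only. 

```python
def combine(r0, r1):
    """R0 + R1: prefix the tuples of R0 with 0 and those of R1 with 1."""
    return {(0,) + a for a in r0} | {(1,) + a for a in r1}


def split(rel):
    """Recover (R0, R1) from a relation by case analysis on its first column."""
    r0 = {t[1:] for t in rel if t[0] == 0}
    r1 = {t[1:] for t in rel if t[0] == 1}
    return r0, r1


# Binary OR decomposes as {1} + {0, 1}: if the first argument is 0 the
# second must be 1, and if the first is 1 the second is unconstrained.
R_OR = combine({(1,)}, {(0,), (1,)})
```

Every relation of arity $r + 1$ arises in this way from a unique pair of $r$-ary relations, 
which is what allows a relation to be classified from the classes of its two halves. 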

Lemma 5.1. Let $R_0, R_1 \in$ OR-conj and let $R = R_0 + R_1$. Then $R \in$ OR-conj, $R \in$ 
NAND-conj or $R$ 3-simulates equality. 

Proof. Let $R_0$ and $R_1$ have arity $r$. We may assume that $R$ has no constant columns. If it 
does, let $R'$ be the relation that results from projecting them away. $R' = R'_0 + R'_1$, where 
both $R'_0$ and $R'_1$ are OR-conj relations. By the remainder of the proof, $R' \in$ OR-conj, 
$R' \in$ NAND-conj or $R'$ 3-simulates equality. Re-instating the constant columns does not 
alter this. For $R$ without constant columns, there are two cases. 

Case 1. $R_0 \subseteq R_1$. Suppose $R_i$ is defined by the normalized OR-conj formula $\phi_i$ in variables 
$x_2, \dots, x_{r+1}$. Then $R$ is defined by the formula 

$$\phi_0 \vee (x_1 = 1 \wedge \phi_1) \equiv (\phi_0 \vee x_1 = 1) \wedge (\phi_0 \vee \phi_1) \equiv (\phi_0 \vee x_1 = 1) \wedge \phi_1, \tag{5.1}$$ 

where the second equivalence is because $\phi_0$ implies $\phi_1$, because $R_0 \subseteq R_1$. $R_1$ has no 
constant column, since such a column would have to be constant with the same value in 
$R_0$, contradicting our assumption that $R$ has no constant columns. There are two cases. 
Case 1.1. $R_0$ has no constant columns. $x_1 = 1$ is equivalent to $\mathrm{OR}(x_1)$ and $\phi_0$ contains 
no pins, so we can rewrite $\phi_0 \vee x_1 = 1$ in CNF. Therefore, (5.1) is OR-conj. 
Case 1.2. $R_0$ has a constant column. Suppose first that the $k$th column of $R_0$ is 
constant-zero. $R_1$ has no constant columns, so the projection of $R$ onto its first and $(k+1)$st columns 
gives the relation $R_{\leftarrow}$, which is $R_{\to}$ with its columns exchanged, and $R$ 3-simulates equality 
by Lemma 4.2. Otherwise, all constant columns of $R_0$ contain ones. Then $\phi_0$ is in CNF, 
since every pin $x_i = 1$ in $\phi_0$ can be written $\mathrm{OR}(x_i)$. Thus, we can write $\phi_0 \vee x_1 = 1$ in CNF, 
so (5.1) defines an OR-conj relation. 
Case 2. $R_0 \not\subseteq R_1$. We will show that $R$ 3-simulates equality or is in NAND-conj. We 
consider two cases (recall that no relation has width 1). 

Case 2.1. At least one of $R_0$ and $R_1$ has positive width. There are two sub-cases. 
Case 2.1.1. $R_1$ has a constant column. Suppose the $k$th column of $R_1$ is constant. If the 
$k$th column of $R_0$ is also constant, then the projection of $R$ to its first and $(k+1)$st columns 
is either equality or disequality (since the corresponding column of $R$ is not constant) so $R$ 
3-simulates equality by Lemma 4.2. Otherwise, if the projection of $R$ to the first and $(k+1)$st 
columns is $R_{\to}$, then $R$ 3-simulates equality by Lemma 4.2. Otherwise, that projection 
must be $R_{\mathrm{NAND}}$. By Lemma 3.3 and the assumption of Case 2.1, $R_{\mathrm{OR}}$ is ppp-definable in 
at least one of $R_0$ and $R_1$ so $R$ 3-simulates equality by Lemma 4.4. 

Case 2.1.2. $R_1$ has no constant columns. By Proposition 3.4, $R_1$ is monotone. Let 
$\mathbf{a} \in R_0 \setminus R_1$: by applying the same permutation to the columns of $R_0$ and $R_1$, we may 
assume that $\mathbf{a} = 0^\ell 1^{r-\ell}$. We must have $\ell \ge 1$ as every non-empty $r$-ary monotone relation 
contains the tuple $1^r$. Let $\mathbf{b} \in R_1$ be a tuple such that $a_i = b_i$ for a maximal initial segment 
of $[1, r]$. By monotonicity of $R_1$, we may assume that $\mathbf{b} = 0^k 1^{r-k}$. Further, we must have 
$k < \ell$, since, otherwise, we would have $\mathbf{b} \le \mathbf{a}$, contradicting our choice of $\mathbf{a} \notin R_1$. 

Now, consider the relation $R' = \{a_0 a_1 \dots a_{\ell-k} \mid a_0 0^k a_1 \dots a_{\ell-k} 1^{r-\ell} \in R\}$, which is the 
result of pinning columns 2 to $(k+1)$ of $R$ to zero and columns $(\ell + 2)$ to $(r+1)$ to 
one and discarding the resulting constant columns. $R'$ contains $0^{\ell-k+1}$ and $1^{\ell-k+1}$ but is 
not complete, since $10^{\ell-k} \notin R'$. By Lemma 4.3, $R'$ and, hence, $R$ 3-simulates equality. 
Case 2.2. Both $R_0$ and $R_1$ have width zero, i.e., are complete relations, possibly padded 
with constant columns. For $i \in [1, r]$, let $R'_i$ be the relation obtained from $R$ by projecting 
onto its first and $(i+1)$st columns. Since $R$ has no constant columns, $R'_i$ is either complete, 
$R_{=}$, $R_{\ne}$, $R_{\mathrm{OR}}$, $R_{\mathrm{NAND}}$, $R_{\to}$ or $R_{\leftarrow}$. If there is a $k$ such that $R'_k$ is $R_{=}$, $R_{\ne}$, $R_{\to}$ or $R_{\leftarrow}$, then 
$R_{=}$, $R_{\ne}$ or $R_{\to}$ is ppp-definable in $R$ and hence $R$ 3-simulates equality by Lemma 4.2. If 
there are $k_1$ and $k_2$ such that $R'_{k_1} = R_{\mathrm{OR}}$ and $R'_{k_2} = R_{\mathrm{NAND}}$, then $R$ 3-simulates equality 
by Lemma 4.4. It remains to consider the following two cases. 

Case 2.2.1. Each $R'_i$ is either $R_{\mathrm{OR}}$ or complete. $R_1$ must be complete, which contradicts 
the assumption that $R_0 \not\subseteq R_1$. 

Case 2.2.2. Each $R'_i$ is either $R_{\mathrm{NAND}}$ or complete. $R_0$ must be complete. Let 
$I = \{i \mid R'_i = R_{\mathrm{NAND}}\}$. Then $R = \bigwedge_{i \in I} \mathrm{NAND}(x_1, x_{i+1})$, so $R \in$ NAND-conj. ■ 

Using the duality between OR-conj and NAND-conj relations, we can prove the 
corresponding result for $R_0, R_1 \in$ NAND-conj. The proof of the classification is completed by a 
simple induction on the arity of $R$. Decomposing $R$ as $R_0 + R_1$ and assuming inductively 
that $R_0$ and $R_1$ are of one of the stated types, we use the previous results in this section 
and Lemma 4.4 to show that $R$ is. 

Theorem 5.2. Every Boolean relation is OR-conj or NAND-conj or 3-simulates equality. ■ 

6. Complexity 

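The complexity of approximating $\#\mathrm{CSP}(\Gamma)$ where the degree of instances is unbounded 
is given by Dyer, Goldberg and Jerrum [9, Theorem 3]. 

Theorem 6.1. Let $\Gamma$ be a Boolean constraint language. 

• If every $R \in \Gamma$ is affine, then $\#\mathrm{CSP}(\Gamma) \in \mathrm{FP}$. 

• Otherwise, if $\Gamma \subseteq$ IM-conj, then $\#\mathrm{CSP}(\Gamma) \equiv_{\mathrm{AP}} \#\mathrm{BIS}$. 

• Otherwise, $\#\mathrm{CSP}(\Gamma) \equiv_{\mathrm{AP}} \#\mathrm{SAT}$. 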

Working towards our classification of the approximation complexity of $\#\mathrm{CSP}(\Gamma)$, we 
first deal with subcases. The IM-conj case and OR-conj/NAND-conj cases are based on 
links between those classes of relations and the problems of counting independent sets in 
bipartite and general graphs, respectively [HUH], the latter extended to hypergraphs. 
Proposition 6.2. If $\Gamma \subseteq$ IM-conj contains at least one non-affine relation, then 
$\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \equiv_{\mathrm{AP}} \#\mathrm{BIS}$ for all $d \ge 3$. ■ 

Proposition 6.3. Let $R$ be an OR-conj or NAND-conj relation of width $w$. Then, for 
$d \ge 2$, $\#w\text{-}\mathrm{HIS}_d \le_{\mathrm{AP}} \#\mathrm{CSP}_d(\{R\} \cup \Gamma_{\mathrm{pin}})$. ■ 

Proposition 6.4. Let $R$ be an OR-conj or NAND-conj relation of width $w$. Then, for 
$d \ge 2$, $\#\mathrm{CSP}_d(\{R\} \cup \Gamma_{\mathrm{pin}}) \le_{\mathrm{AP}} \#w\text{-}\mathrm{HIS}_{kd}$, where $k$ is the greatest number of times that 
any variable appears in the normalized formula defining $R$. ■ 

We now give the complexity of approximating $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$ for $d \ge 3$. 

Theorem 6.5. Let T be a Boolean constraint language and let d ^ 3. 

• // every ReT is affine, then #CSP d (r U r pin ) G FP. 

• Otherwise, ifTQ IM-conj, then #CSP d (r U r pin ) = AP #BIS. 

• Otherwise, if T C OR-conj or T C NAND-conj, then let w be the greatest width 
of any relation in T and let k be the greatest number of times that any variable 
appears in the normalized formulae defining the relations ofT. Then #w-HISd ^ap 

#csp d (r u r pin ) ^ A p #w-ms kd . 

• Otherwise, #CSP d (r U r pin ) = A p #SAT. 

Proof. The affine case is immediate from Theorem 16. 11 (rur p ; n is affine if, and only if, T is.) 
Otherwise, if T C IM-conj and some R € T is not affine, then ^CSP^r U r p i n ) =ap #BIS 
by Proposition 16.21 Otherwise, if T C OR-conj or T C NAND-conj, then #w-HISd ^ap 
#CSP d (r U r pin ) ^ A p #w-}ilS kd by Propositions E2\ and El 

Finally, suppose that $\Gamma$ is not affine, $\Gamma \not\subseteq$ IM-conj, $\Gamma \not\subseteq$ OR-conj and $\Gamma \not\subseteq$ NAND-conj.
Since $\Gamma \cup \Gamma_{\mathrm{pin}}$ is neither affine nor a subset of IM-conj, we have $\#\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{SAT}$
by Theorem 6.1. So, if we can show that $\Gamma$ $d$-simulates equality, then
$\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ by Proposition 4.1 and we are done. If $\Gamma$ contains a relation $R$ that is
neither OR-conj nor NAND-conj, then $R$ 3-simulates equality by Theorem 5.2. Otherwise,
$\Gamma$ must contain distinct relations $R_1 \in$ OR-conj and $R_2 \in$ NAND-conj that are non-affine
and so have width at least two. So $\Gamma$ 3-simulates equality by Lemma 4.4. ■
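
The first case of Theorem 6.5 depends only on whether every relation in $\Gamma$ is affine. As a minimal sketch, this can be tested with a standard characterization that is not stated in this section and is assumed here: a Boolean relation is the solution set of a system of linear equations over GF(2) if, and only if, it is closed under the coordinatewise operation $x \oplus y \oplus z$. The function name and the example relations below are illustrative choices, not notation from the paper.

```python
from itertools import product

def is_affine(relation):
    """Return True if the Boolean relation is closed under x XOR y XOR z.

    `relation` is a collection of equal-length tuples over {0, 1}.
    Assumption: closure under the triple XOR characterizes affine relations.
    """
    tuples = set(relation)
    for x, y, z in product(tuples, repeat=3):
        combined = tuple(a ^ b ^ c for a, b, c in zip(x, y, z))
        if combined not in tuples:
            return False
    return True

# Small binary relations, written as sets of satisfying assignments.
EQUALITY = {(0, 0), (1, 1)}
IMPLIES = {(0, 0), (0, 1), (1, 1)}
OR = {(0, 1), (1, 0), (1, 1)}
NAND = {(0, 0), (0, 1), (1, 0)}
```

Under this test, equality is affine, while implication, OR and NAND are not, which is consistent with their appearing in the non-affine cases of the theorem.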

Unless NP = RP, there is no FPRAS for counting independent sets in graphs of
maximum degree at least 25 [7], and, therefore, no FPRAS for $\#w\text{-}\mathrm{HIS}_d$ with $w \geq 2$ and
$d \geq 25$. Further, since #SAT is complete for #P under AP-reductions [8], #SAT cannot
have an FPRAS unless NP = RP. From Theorem 6.5 above we have the following corollary.

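Corollary 6.6. Let $\Gamma$ be a Boolean constraint language and let $d \geq 25$.

• If every $R \in \Gamma$ is affine, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) \in \mathrm{FP}$.

• Otherwise, if $\Gamma \subseteq$ IM-conj, then $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}}) =_{\mathrm{AP}} \#\mathrm{BIS}$.

• Otherwise, there is no FPRAS for $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$, unless NP = RP. ■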

$\Gamma \cup \Gamma_{\mathrm{pin}}$ is affine (respectively, in OR-conj or in NAND-conj) if, and only if, $\Gamma$ is, so the
case for large-degree instances ($d \geq 25$) corresponds exactly in complexity to the unbounded
case [9]. The case for lower degree bounds is more complex. To put Theorem 6.5 in context,
we summarize the known approximability of $\#w\text{-}\mathrm{HIS}_d$, parameterized by $d$ and $w$.

The case $d = 1$ is clearly in FP (Theorem 2.1) and so is the case $d = w = 2$, which
corresponds to counting independent sets in graphs of maximum degree two. For $d = 2$ and
width $w \geq 3$, Dyer and Greenhill have shown that there is an FPRAS for $\#w\text{-}\mathrm{HIS}_d$ [11].
For $d = 3$, they have shown that there is an FPRAS if the width $w$ is at most 3.
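
To illustrate why the case $d = 1$ is easy, here is a small sketch under two assumptions that are not spelled out in this section: an independent set of a hypergraph is taken to be a set of vertices that contains no hyperedge as a subset, and vertices are numbered $0, \ldots, n-1$. When every vertex lies in at most one hyperedge, the hyperedges are pairwise disjoint, so each hyperedge $e$ contributes a factor $2^{|e|} - 1$ (every subset of $e$ except $e$ itself) and each remaining vertex contributes a factor 2. The function names are hypothetical, and the brute-force counter is included only as a cross-check.

```python
from itertools import combinations

def count_independent_sets_degree_one(num_vertices, hyperedges):
    """Closed-form count when every vertex lies in at most one hyperedge."""
    covered = set()
    count = 1
    for edge in hyperedges:
        edge = frozenset(edge)
        if covered & edge:
            raise ValueError("a vertex lies in two hyperedges, so the degree exceeds 1")
        covered |= edge
        # Any subset of the edge is allowed except the whole edge.
        count *= 2 ** len(edge) - 1
    # Vertices in no hyperedge can be included or excluded freely.
    free_vertices = num_vertices - len(covered)
    return count * 2 ** free_vertices

def count_independent_sets_brute_force(num_vertices, hyperedges):
    """Exponential-time count over all vertex subsets, for checking small cases."""
    edges = [frozenset(e) for e in hyperedges]
    total = 0
    for size in range(num_vertices + 1):
        for subset in combinations(range(num_vertices), size):
            chosen = set(subset)
            if not any(e <= chosen for e in edges):
                total += 1
    return total
```

For six vertices with hyperedges {0, 1} and {2, 3, 4}, both functions return $3 \cdot 7 \cdot 2 = 42$.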






Degree $d$          Width $w$      Approximability of $\#w\text{-}\mathrm{HIS}_d$
1                   $\geq 2$       FP
2                   2              FP
2                   $\geq 3$       FPRAS [11]
3                   2, 3           FPRAS [11]
3, 4, 5             2              PTAS [30]
6, ..., 24          $\geq 2$       The MCMC method is likely to fail [7]
$\geq 25$           $\geq 2$       No FPRAS unless NP = RP [7]

Table 1: Approximability of $\#w\text{-}\mathrm{HIS}_d$ (still open for all other values of $d$ and $w$).

For larger width, the approximability of $\#w\text{-}\mathrm{HIS}_3$ is still not known. With the width
restricted to $w = 2$ (normal graphs), Weitz has shown that, for degree $d \in \{3, 4, 5\}$, there
is a deterministic approximation scheme that runs in polynomial time (a PTAS) [30]. This
extends a result of Luby and Vigoda, who gave an FPRAS for $d \leq 4$ [24]. For $d > 5$,
approximating $\#w\text{-}\mathrm{HIS}_d$ becomes considerably harder. More precisely, Dyer, Frieze and
Jerrum have shown that for $d = 6$ the Monte Carlo Markov chain technique is likely to
fail, in the sense that "cautious" Markov chains are provably slowly mixing [7]. They
also showed that, for $d = 25$, there can be no polynomial-time algorithm for approximate
counting, unless NP = RP. These results imply that for $d \in \{6, \ldots, 24\}$ and $w \geq 2$ the
Monte Carlo Markov chain technique is likely to fail and for $d \geq 25$ and $w \geq 2$, there can
be no FPRAS unless NP = RP. Table 1 summarizes the results.
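
The rows of Table 1 can also be read as a lookup on the pair $(d, w)$. The sketch below only restates the table and does not compute anything about a given hypergraph. The function name and the result strings are illustrative, and citation numbers are omitted. Because the table's rows overlap at $d = 3$, $w = 2$, the function returns every matching entry. An empty list means that the table records the case as open.

```python
def known_approximability(d, w):
    """Return the Table 1 entries that apply to #w-HIS_d, as a list of strings."""
    if d < 1 or w < 2:
        raise ValueError("Table 1 covers degree d >= 1 and width w >= 2 only")
    results = []
    if d == 1:
        results.append("FP")
    if d == 2 and w == 2:
        results.append("FP")
    if d == 2 and w >= 3:
        results.append("FPRAS")
    if d == 3 and w in (2, 3):
        results.append("FPRAS")
    if d in (3, 4, 5) and w == 2:
        results.append("PTAS")
    if 6 <= d <= 24:
        results.append("MCMC method likely to fail")
    if d >= 25:
        results.append("no FPRAS unless NP = RP")
    return results
```

For instance, `known_approximability(3, 2)` reports both the FPRAS and the PTAS rows, while `known_approximability(4, 3)` returns an empty list.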

Returning to bounded-degree #CSP, the case $d = 2$ seems to be rather different from
degree bounds three and higher. This is also the case for decision CSP: recall that
degree-$d$ $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ has the same complexity as unbounded-degree $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ for all
$d \geq 3$ [5], while degree-2 $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ is often easier than the unbounded-degree case [61 113]
but the complexity of degree-2 $\mathrm{CSP}(\Gamma \cup \Gamma_{\mathrm{pin}})$ is still open for some $\Gamma$.

Our key techniques for determining the complexity of $\#\mathrm{CSP}_d(\Gamma \cup \Gamma_{\mathrm{pin}})$ for $d \geq 3$ were
the 3-simulation of equality and Theorem 5.2, which says that every Boolean relation is in
OR-conj, in NAND-conj or 3-simulates equality. However, it seems that not all relations that
3-simulate equality also 2-simulate equality, so the corresponding classification of relations
does not appear to hold. It seems that different techniques will be required for the degree-2
case. For example, it is possible that there is no FPRAS for $\#\mathrm{CSP}_2(\Gamma \cup \Gamma_{\mathrm{pin}})$ except when
$\Gamma$ is affine. However, Bubley and Dyer have shown that there is an FPRAS for degree-2
#SAT, even though the exact counting problem is #P-complete pQ. This shows that there
is a class $C$ of constraint languages for which $\#\mathrm{CSP}_2(\Gamma \cup \Gamma_{\mathrm{pin}})$ has an FPRAS for every
$\Gamma \in C$ but for which no exact polynomial-time algorithm is known.

We leave the complexity of degree-2 #CSP, of #BIS and of the various parameterized
versions of the problem of counting hypergraph independent sets as open questions.